ABSTRACT: The model representing two independent Poisson processes, labelled as "signal" and
"background" and both contributing additively to the total number of counted events, is considered
from a Bayesian point of view. This model is widely used in searches for rare or exotic
events in the presence of a background source, for example in the searches performed by
high-energy physics experiments. Assuming prior knowledge about the background yield, a
reference prior is obtained for the signal alone and its properties are studied. Finally, the properties
of the full solution, the marginal reference posterior, are illustrated with a few examples.

1. Introduction

Searches for rare events or faint signals are very common in scientific research. Here we consider
only counting experiments, for example underground detectors which measure the result of
high-energy particle collisions, and we are interested in situations in which a known set of "background"
processes contributes to the number of observed events, on top of which one looks for a possible
excess which can be attributed to the faint signal. In these situations, most searches do not find
evidence for a new signal and their results are summarized by providing upper limits on the signal
intensity (see for example [jl|]). The analysis of the significance of an experimental result, or of the
expected significance of a planned measurement, is a very important task, and the problem has been
treated several times in the frequentist framework (see for example [^, ||]). Here the problem is
approached from the Bayesian point of view, as was recently done by Demortier, Jain, and
Prosper in Ref. [Q] (hereafter the "DJP paper"), whose work stimulated the present study,
and in a similar paper by Pierini, Prosper, Sekmen and Spiropulu ^ (hereafter the "PPSS
paper").¹

¹ The PPSS paper appeared on arXiv a few days before this paper, and a reference to it was added here during the
review process. It reports independent developments on a similar subject.

We assume that the integer number $k = 0, 1, \ldots$ of observed events follows a Poisson
distribution and that the signal and background sources are independently Poisson distributed with
parameters $s$ and $b$, such that the probability of observing $k$ counts is given by a Poisson distribution
with parameter $s + b$: $P(k) = \mathrm{Poi}(k|s+b)$. Here we are interested in the signal intensity $s > 0$, as
is practically always the case in searches for new phenomena, and treat $b > 0$ as a nuisance
parameter.

In the Bayesian framework, the result of the statistical inference is provided by the joint
posterior probability density

\[ p(s,b|k) \propto \mathrm{Poi}(k|s+b)\, p(s,b) \tag{1.1} \]

where $p(s,b)$ is the prior density, which encodes the experimenter's degree of belief based on
the information available before performing the experiment, and the likelihood function is the
Poisson model $\mathrm{Poi}(k|s+b)$ itself, considered as a function of $s + b$ with $k$ fixed at the
actual observation. The normalization constant can be found by imposing
that the integral of $p(s,b|k)$ is one.

Here we look for a solution which depends only on the assumed model and the observed data,
and not on any additional prior information; hence we follow the approach dictated by the
so-called reference analysis [6]. A key ingredient is the formulation of the reference prior $\pi(s,b)$
[Q], defined so as to maximize the amount of missing information, and the result of the inference is
the reference posterior obtained by using $\pi(s,b)$ in place of $p(s,b)$ in Bayes' theorem (1.1).
Because we are not interested in making inference about $b$, we integrate the posterior over it to
obtain the marginal posterior $p(s|k) = \int_0^\infty p(s,b|k)\, db$, our final solution. This approach has
also been followed by the DJP and PPSS papers.

Although DJP considered signal and background as independent Poisson processes, as in this
work, they preferred a model in which the signal strength is multiplied by a parameter which should
encode the uncertainty about both the signal efficiency and the integrated luminosity.² However,
the background parameter is not multiplied by a similar quantity, although one expects to have
some luminosity-dependent factor here too. A better model would describe the probability of
counting $n$ events as the Poisson probability $\mathrm{Poi}(n|(\varepsilon_s \sigma + \varepsilon_b \mu)\mathcal{L})$, in which the signal cross-section
$\sigma$ is multiplied by the signal efficiency $\varepsilon_s$, the background cross-section $\mu$ is multiplied by the
corresponding efficiency $\varepsilon_b$, and the integrated luminosity $\mathcal{L}$ appears explicitly, because it affects
the signal and background yields in the same way (being 100% correlated). Such a model would be quite
complex: $\sigma$ is the only parameter of interest, hence the marginal posterior is obtained after
integration over the remaining four nuisance parameters. The prior for $\mu$ would be a Gamma
density as in the DJP paper (the Gamma density is the conjugate prior for the Poisson process),
the priors for the efficiencies $\varepsilon_s, \varepsilon_b$ would be Beta densities (the conjugate family for the binomial
problem), and the prior for $\mathcal{L}$ would be a Gaussian (or a Gamma density peaked very far from
zero, which looks very similar to a Gaussian). For the purpose of the present work, this model is
too complicated. To simplify things, we treat a model in which there are only two parameters
$s, b$ describing the expected numbers of events coming from the signal and background processes,
assumed to be mutually independent (and independent of other parameters which do not appear
explicitly in the model, like the luminosity, as is done in the PPSS paper).

² This means that their signal and background strengths represent the cross-sections of the corresponding processes,
whereas in this paper the two parameters represent the expected counts.



Here s and b are dimensionless numbers (not cross-sections) which are sufficient to describe 
in a stylized way the "discovery problem" in which one is interested in finding an excess of events 
over the background prediction. Qualitatively, in case a statistically significant excess is found, the 
next step is to evaluate the signal cross-section which corresponds to the measurement, which re- 
quires adopting the complicated model above, with a single parameter of interest and four nuisance 
parameters. Although in practice one works from the very beginning with a very complex model
(with several parameters modelling the detector response and the theoretical uncertainties), from 
the mathematical point of view discovery and cross-section measurements can be considered two 
distinct phases, which can be treated in sequence. Here we consider the statistical approach which 
is sufficient for the discovery problem in a counting experiment for which the total uncertainty on 
the background yield is the only nuisance parameter, and defer the treatment of the complex model 
which is needed for estimating the cross-section to another paper. 

In this paper we do not address in detail the general Bayesian approach to discovery, but
focus only on the marginal posterior for the signal in the presence of background with unknown yield
(which is a necessary but not sufficient ingredient). However, the reader should be aware that
there are several issues in the use of the "Bayes factor" when deciding among two or
more possible hypotheses when improper priors are used, as recently summarized by Berger [j|].
The widespread use of flat priors in Bayesian computations basically makes it impossible to take
a decision based on the Bayes factor in a proper way, and this is also true for the reference prior
in the Poisson model considered here (see below), which is improper. Instead, the two posterior
solutions which correspond to the competing hypotheses should be compared. This approach has
no hidden trap when improper priors are chosen, and the only difficulty is that there is no consensus
about the threshold which the posterior ratio should exceed to claim a discovery, in contrast with
the conventional frequentist "five-sigma" rule for the significance of the discrepancy with respect
to the background-only hypothesis (which is not free of problems [Q]). A promising formal
approach based on Bayesian decision theory is being proposed by Bernardo [10].
One would choose the hypothesis which minimizes the expected posterior loss
incurred when acting as if that hypothesis were correct. A convenient choice of loss function
which is parametrization invariant is the so-called "intrinsic discrepancy loss", defined as the
minimum of the two Kullback-Leibler directed divergences between two probability models [11].
When averaged over the reference posterior, the resulting reference posterior intrinsic loss has a scale which
can be mapped onto the displacement with respect to the Gaussian mean. This way, a threshold
which corresponds to the "five-sigma" rule, or to any other minimal significance, can be defined. For
more details, see [10] and the papers cited therein.



In all searches for new phenomena there is quite a lot of information about the background,
coming from several auxiliary measurements plus Monte Carlo simulations of the known physical
processes. If this were not true, nobody would trust a discovery of a new signal based on the
outcome of the experiment. Hence we assume that an informative prior for $b$ is available which
encodes the experimenter's degree of belief about the "reasonable" range of $b$ and its "most likely"
values, written in the form of a Gamma density (the conjugate prior of the Poisson model):

\[ p(b) = \mathrm{Ga}(b|\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, b^{\alpha-1} e^{-\beta b} \tag{1.2} \]

with shape parameter $\alpha > 0$ and rate parameter $\beta > 0$ (or scale parameter $1/\beta > 0$). In the
simple (but frequent) case in which there is limited prior knowledge about $b$ and only its expectation
and variance are known, the Gamma parameters are determined by imposing $E[b] = \alpha/\beta$ and
$V[b] = \alpha/\beta^2$.
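As a minimal sketch of this moment matching (the function name `gamma_from_moments` is an illustrative choice, not something defined in the paper), inverting $E[b] = \alpha/\beta$ and $V[b] = \alpha/\beta^2$ gives $\beta = E[b]/V[b]$ and $\alpha = E[b]^2/V[b]$:

```python
def gamma_from_moments(mean, variance):
    """Return (alpha, beta) of the Gamma density with the given mean and variance.

    Inverts E[b] = alpha/beta and V[b] = alpha/beta**2.
    The function name is hypothetical, chosen for this illustration only.
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    beta = mean / variance   # rate parameter
    alpha = mean * beta      # shape parameter, equal to mean**2 / variance
    return alpha, beta


# Example: a background expectation of 2 counts with 50% relative uncertainty
alpha, beta = gamma_from_moments(2.0, (0.5 * 2.0) ** 2)
print(alpha, beta)
```

For a relative uncertainty $r$ on the expectation, the same relations reduce to $\alpha = 1/r^2$ and $\beta = \alpha/E[b]$.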

Incidentally, one can note that it is quite common to use a Gaussian distribution to model
the uncertainty on the yield of the background processes, although strictly speaking this is not
the correct solution. The reason is that the density function should be defined on the domain of the
parameter, which in this case is $b \in \mathbb{R}^+$, whereas the Gaussian distribution is defined over the whole
real axis. Hence, the normal distribution needs to be truncated at zero, which implies renormalizing
it and recomputing its mean and standard deviation (when $N(x; \mu, \sigma)$ is truncated at zero, $\mu$ remains
the peak position but is no longer equal to the mean, and $\sigma$ is related to the right-width but is no
longer the standard deviation). When the distance (in units of $\sigma$) between the peak and the origin
is large enough, in practice one can proceed as if no truncation were necessary. However, this is
often not the case, so that truncation does need to be considered. Because the domain of a
random variable is the very first ingredient in the specification of a probability model, it appears
more reasonable to limit the search for the background prior to functions which are defined
on the positive real semiaxis only. Two reasonable choices are the log-normal distribution and the
Gamma density. The first may have a theoretical motivation when the uncertainty is attributed to the
fluctuations of several additional independent contributions whose scale is uncertain. However, a
Gamma density can mimic a log-normal distribution very well (and also a normal distribution when
the latter is a good approximation) and has the advantage of belonging to the family of conjugate
priors for the Poisson model, which implies that the posterior also belongs to the same family. This
makes the choice of a conjugate prior most convenient, because simple relations exist between the
parameters of the prior and posterior Gamma densities, which only depend on the observed data.
It is worth emphasizing that this choice does not set any limit to the shape of the prior: a linear
combination of Gamma densities can reproduce any shape. In this case, the posterior will be a
linear combination, with the same weights, of the solutions which correspond to the individual
Gamma densities appearing in the prior, thanks to the linearity of Bayes' theorem (1.1).
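The size of the truncation effect mentioned above can be quantified with the standard formulas for a normal density truncated from below. The sketch that follows (function name and example values are illustrative assumptions, not taken from the paper) evaluates the mean and standard deviation of a normal density with peak `mu` and width `sigma` once it is restricted to positive values:

```python
import math


def truncated_normal_moments(mu, sigma):
    """Mean and standard deviation of N(x; mu, sigma) truncated to x > 0."""
    a = -mu / sigma                                  # truncation point in standard units
    phi = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(a / math.sqrt(2.0))       # P(Z > a) for a standard normal Z
    lam = phi / tail
    mean = mu + sigma * lam
    variance = sigma ** 2 * (1.0 + a * lam - lam ** 2)
    return mean, math.sqrt(variance)


# Peak at 2 counts with a width of 2 counts: the truncation matters
print(truncated_normal_moments(2.0, 2.0))
# Peak at 2 counts with a width of 0.2 counts: the truncation is irrelevant
print(truncated_normal_moments(2.0, 0.2))
```

In the first case the mean moves well above the peak position and the standard deviation falls below the nominal width, which is the situation described in the text; in the second case both moments are practically unchanged.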

Often, in particle-physics experiments the event selection has
the goal of suppressing the background contribution as much as possible, while keeping a high
efficiency for the signal. This often means that the background estimate has a sizable uncertainty,
because the background events surviving the selection fall in the tails of the distributions of the
physical observables, which would require the generation of enormous numbers of events to be
well reproduced by Monte Carlo simulations. Hence the marginal prior density is often quite
broad. In the opposite situation in which $b$ is perfectly known, the prior (1.2) degenerates into a $\delta$
function, which happens when $\alpha, \beta \to \infty$ while keeping $E[b] = \alpha/\beta = b_0$ constant (it is sufficient
to set $\beta = \alpha/b_0$, which implies $V[b] = b_0^2/\alpha \to 0$). In this case, it is easy to show (appendix D)
that the reference prior for $s$ is Jeffreys' prior for $s' = s + b_0$ (i.e. $b_0$ simply redefines the origin of
the random variable): $\pi(s) \propto (s + b_0)^{-1/2}$. In the following, we address the general problem with
any value of $\alpha, \beta > 0$.

Because no prior knowledge is assumed for the signal parameter $s$, a reference prior $\pi(s)$ for $s$
is desired, such that the joint prior can be written in the form $p(s,b) = p(s)\, p(b|s) = \pi(s)\, \mathrm{Ga}(b|\alpha,\beta)$.
Two techniques for finding the reference prior when the conditional density of the other parameter
is known are explained by Sun and Berger [12], although only the second one ("Option 2" in their
paper) is applied here. This technique can be summarized as follows. The starting point is the
marginal model $p(k|s)$, specifying the probability of counting $k \ge 0$ events under the hypothesis that
the signal yield is $s \ge 0$, with the assumed knowledge about the background contribution:

\[ p(k|s) = \int_0^\infty \mathrm{Poi}(k|s+b)\, \mathrm{Ga}(b|\alpha,\beta)\, db . \tag{1.3} \]

Next, this model is used to compute Fisher's information

\[ I(s) = E\left[\left(\frac{\partial}{\partial s}\log p(k|s)\right)^2\right] = -E\left[\frac{\partial^2}{\partial s^2}\log p(k|s)\right] \tag{1.4} \]

where the last expression is valid for asymptotically normal models, as in our case. Finally, the
reference prior for $s$ is $\pi(s) \propto |I(s)|^{1/2}$.

In the rest of the paper, these steps are considered in sequence. Section 2 illustrates the
properties of the marginal model $p(k|s)$, which is used in section 3 to compute Fisher's information
$I(s)$. Finally, the reference prior $\pi(s)$ is computed in section 4 and a few examples of its application
are shown in section 5.



2. The marginal model 



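Here we find an explicit formula for the marginal model (1.3), in terms of the Poisson-Gamma
mixture [|J,

\[ P(k|\alpha,\beta) = \int_0^\infty \mathrm{Poi}(k|\theta)\, \mathrm{Ga}(\theta|\alpha,\beta)\, d\theta = \frac{\beta^{\alpha}\, \Gamma(\alpha+k)}{k!\, \Gamma(\alpha)\, (1+\beta)^{\alpha+k}} \tag{2.1} \]

a result which can be obtained by means of the Gamma integral

\[ \Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\, dx . \]

First, we note that

\[ \mathrm{Poi}(k|s+b) = e^{-s} e^{-b}\, \frac{(s+b)^k}{k!} = e^{-s} e^{-b} \sum_{n=0}^{k} \frac{s^{k-n}\, b^n}{n!\,(k-n)!} \tag{2.2} \]

which allows us to rewrite (1.3) in the form

\[ p(k|s) = \sum_{n=0}^{k} \frac{e^{-s} s^{k-n}}{(k-n)!} \int_0^\infty \frac{b^n e^{-b}}{n!}\, \mathrm{Ga}(b|\alpha,\beta)\, db
         = \sum_{n=0}^{k} \frac{e^{-s} s^{k-n}}{(k-n)!} \int_0^\infty \mathrm{Poi}(n|b)\, \mathrm{Ga}(b|\alpha,\beta)\, db \tag{2.3} \]

which, with the help of (2.1), becomes

\[ p(k|s) = \left(\frac{\beta}{1+\beta}\right)^{\alpha} \sum_{n=0}^{k} \frac{\Gamma(n+\alpha)/\Gamma(\alpha)}{n!}\, \frac{e^{-s} s^{k-n}}{(k-n)!\,(1+\beta)^n} . \tag{2.4} \]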
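Equation (2.4) can be cross-checked numerically against the defining integral (1.3). The following sketch does so with a plain midpoint rule; the function names, the integration range `b_max` and the number of steps are illustrative assumptions rather than anything prescribed in the text:

```python
import math


def poisson_pmf(k, mu):
    """Poi(k | mu) for mu > 0, evaluated in log space."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def gamma_pdf(b, alpha, beta):
    """Ga(b | alpha, beta) with shape alpha and rate beta, for b > 0."""
    return math.exp(alpha * math.log(beta) + (alpha - 1) * math.log(b)
                    - beta * b - math.lgamma(alpha))


def marginal_series(k, s, alpha, beta):
    """Marginal model p(k|s) from the finite sum of eq. (2.4)."""
    total = 0.0
    for n in range(k + 1):
        rising = math.exp(math.lgamma(n + alpha) - math.lgamma(alpha))
        total += (rising / math.factorial(n)
                  * s ** (k - n) / (math.factorial(k - n) * (1 + beta) ** n))
    return (beta / (1 + beta)) ** alpha * math.exp(-s) * total


def marginal_quadrature(k, s, alpha, beta, b_max=60.0, steps=60000):
    """Marginal model p(k|s) from eq. (1.3), midpoint rule on (0, b_max)."""
    h = b_max / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * h          # midpoints avoid the endpoint b = 0
        total += poisson_pmf(k, s + b) * gamma_pdf(b, alpha, beta)
    return total * h


# The two numbers should agree to many digits
print(marginal_series(3, 1.5, 3.0, 1.0), marginal_quadrature(3, 1.5, 3.0, 1.0))
```

The quadrature is only meant as an independent check: the closed form is both faster and more accurate, which is why it is used in the rest of the derivation.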



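The "rising factorial" (or Pochhammer function) $(\alpha)^{(n)} = \Gamma(n+\alpha)/\Gamma(\alpha) = \alpha(\alpha+1)(\alpha+2)\cdots(\alpha+n-1)$
is defined as

\[ (\alpha)^{(n)} = \begin{cases} 1 & \text{if } n = 0 \\ \alpha & \text{if } n = 1 \\ (\alpha)^{(k-1)}\,(\alpha+k-1) & \text{if } n = k > 1 \end{cases} \]

and the ratio $(\alpha)^{(n)}/n!$ defines the binomial coefficient $\binom{\alpha+n-1}{n}$ with a real upper parameter, such
that the marginal model (2.4) can also be written in the final form

\[ p(k|s) = \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s}\, f(s;k,\alpha,\beta) \tag{2.5} \]

where the polynomial

\[ f(s;k,\alpha,\beta) = \sum_{n=0}^{k} \binom{\alpha+n-1}{n} \frac{s^{k-n}}{(k-n)!\,(1+\beta)^n} \tag{2.6} \]

has explicit forms given in table 1.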

The function $f(s;k,\alpha,\beta)$ has some interesting properties which are useful in the following
treatment. The proofs are given in appendix B. Appendix A provides additional information which
is useful when writing code to evaluate this polynomial.
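A direct way to evaluate the polynomial (2.6) is sketched below. It builds the coefficients $\binom{\alpha+n-1}{n}(1+\beta)^{-n}$ and $s^m/m!$ by recurrence and then forms the finite sum. This is only one possible implementation (the one discussed in the appendix may differ), and the function names are illustrative:

```python
import math


def f_table(s, k_max, alpha, beta):
    """Return the list [f(s;0), ..., f(s;k_max)] of eq. (2.6)."""
    c = [1.0]   # c[n] = binom(alpha+n-1, n) / (1+beta)**n
    t = [1.0]   # t[m] = s**m / m!
    for n in range(1, k_max + 1):
        c.append(c[-1] * (alpha + n - 1) / (n * (1.0 + beta)))
        t.append(t[-1] * s / n)
    return [sum(c[n] * t[k - n] for n in range(k + 1)) for k in range(k_max + 1)]


def marginal_model(k, s, alpha, beta):
    """p(k|s) of eq. (2.5)."""
    norm = (beta / (1.0 + beta)) ** alpha * math.exp(-s)
    return norm * f_table(s, k, alpha, beta)[k]


s, alpha, beta = 1.5, 3.0, 1.0
f = f_table(s, 200, alpha, beta)
# Compare with the explicit k = 1 and k = 2 forms of the polynomial
print(f[1], s + alpha / (1 + beta))
print(f[2], s**2 / 2 + s * alpha / (1 + beta) + alpha * (alpha + 1) / (2 * (1 + beta) ** 2))
# The sum over k approaches exp(s) * (1 + 1/beta)**alpha, so that p(k|s) is normalized
print(sum(f), math.exp(s) * (1.0 + 1.0 / beta) ** alpha)
```

Working with ratios of consecutive coefficients avoids evaluating large Gamma functions or factorials explicitly, which keeps the computation stable for non-integer $\alpha$.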

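Property 1. For each $\alpha, \beta > 0$ and $s \ge 0$ the series $\sum_{k=0}^{\infty} f(s;k,\alpha,\beta)$ converges to

\[ e^{s} \left(1 + \frac{1}{\beta}\right)^{\alpha} . \]

This ensures that the marginal model (2.5) is properly normalized.

Property 2. The n-th partial derivative of $f(s;k,\alpha,\beta)$ with respect to $s$ is

\[ f^{(n)}(s;k,\alpha,\beta) = \begin{cases} 0 & \text{if } k < n \\ 1 & \text{if } k = n \\ f(s;k-n,\alpha,\beta) & \text{if } k > n \end{cases} \tag{2.7} \]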
[Figure 1: plot of the marginal model $p(k|s) = \int \mathrm{Poi}(k|s+b)\, p(b)\, db$ as a function of $s$. Curves: a: k = 1, α = 3, β = 1; b: k = 1, α = 5, β = 1; c: k = 1, α = 10, β = 1; d: k = 10, α = 5, β = 1; e: k = 10, α = 3, β = 1; f: k = 10, α = 1, β = 1.]

Figure 1. Marginal likelihood for different background priors, with 1 and 10 observed counts.



Property 2 makes it easy to compute all derivatives, the evaluation of a single function being 
sufficient. 
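Property 2 can be verified numerically with a finite-difference check. The sketch below uses a direct evaluation of the polynomial (2.6); the helper names and the step size `h` are illustrative assumptions:

```python
def f_poly(s, k, alpha, beta):
    """f(s;k,alpha,beta) of eq. (2.6), by direct summation."""
    total, coeff = 0.0, 1.0          # coeff = binom(alpha+n-1, n) / (1+beta)**n
    for n in range(k + 1):
        power = 1.0                  # becomes s**(k-n) / (k-n)!
        for m in range(1, k - n + 1):
            power *= s / m
        total += coeff * power
        coeff *= (alpha + n) / ((n + 1) * (1.0 + beta))
    return total


def f_derivative(order, s, k, alpha, beta):
    """Derivative of the given order with respect to s, using Property 2."""
    if k < order:
        return 0.0
    return f_poly(s, k - order, alpha, beta)   # equals 1 when k == order


s, k, alpha, beta, h = 1.5, 5, 3.0, 1.0, 1e-4
numeric = (f_poly(s + h, k, alpha, beta) - f_poly(s - h, k, alpha, beta)) / (2 * h)
# The central difference and the shifted polynomial should agree closely
print(numeric, f_derivative(1, s, k, alpha, beta))
```

The point of the property is visible in `f_derivative`: no separate derivative code is needed, only a call to the same polynomial with a lower index.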



Property 3. For each finite $n \ge 1$, the sum $\sum_{k=0}^{\infty} f^{(n)}(s;k,\alpha,\beta)$ also converges to $e^{s}\,(1 + 1/\beta)^{\alpha}$.

This ensures that the expectation $E[f^{(n)}/f] = 1$ for each $n$.

The marginal likelihood is shown in figure 1, for different background parameters, in the case
of 1 and 10 observed counts. As shown in detail in appendix C, the marginal model (2.5)
coincides with the appropriate limit of the marginal model of the DJP paper and with the marginal
model of the PPSS paper.



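  k        $f(s;k,\alpha,\beta)$

  0        $1$
  1        $s + \frac{\alpha}{1+\beta}$
  2        $\frac{s^2}{2} + \frac{s\,\alpha}{1+\beta} + \frac{\alpha(\alpha+1)}{2(1+\beta)^2}$
  3        $\frac{s^3}{6} + \frac{s^2\,\alpha}{2(1+\beta)} + \frac{s\,\alpha(\alpha+1)}{2(1+\beta)^2} + \frac{\alpha(\alpha+1)(\alpha+2)}{6(1+\beta)^3}$
  n ≥ 4    $\frac{s^n}{n!} + \frac{\alpha\, s^{n-1}}{(n-1)!\,(1+\beta)} + \frac{\alpha(\alpha+1)\, s^{n-2}}{(n-2)!\;2\,(1+\beta)^2} + \frac{\alpha(\alpha+1)(\alpha+2)\, s^{n-3}}{(n-3)!\;3!\,(1+\beta)^3} + \cdots + \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{n!\,(1+\beta)^n}$

Table 1. Explicit forms of $f(s;k,\alpha,\beta)$ for small values of $k$.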
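3. The Fisher's information

The logarithm of the marginal model (2.5) is

\[ \log p(k|s) = \alpha \log\frac{\beta}{1+\beta} - s + \log f(s;k,\alpha,\beta) \tag{3.1} \]

and its derivative with respect to $s$ is

\[ \frac{\partial \log p(k|s)}{\partial s} = -1 + \frac{f^{(1)}(s;k,\alpha,\beta)}{f(s;k,\alpha,\beta)} \tag{3.2} \]

while the second derivative is

\[ \frac{\partial^2 \log p(k|s)}{\partial s^2} = \frac{f^{(2)}(s;k,\alpha,\beta)}{f(s;k,\alpha,\beta)} - \left[\frac{f^{(1)}(s;k,\alpha,\beta)}{f(s;k,\alpha,\beta)}\right]^2 . \tag{3.3} \]

The expectation in the definition of Fisher's information (1.4) splits into two terms

\[ I(s) = E\left[\frac{[f^{(1)}(s;k,\alpha,\beta)]^2}{[f(s;k,\alpha,\beta)]^2}\right] - E\left[\frac{f^{(2)}(s;k,\alpha,\beta)}{f(s;k,\alpha,\beta)}\right] \tag{3.4} \]

which can be evaluated independently. The expectation $E[f^{(2)}/f]$ is one by virtue of Property 3.
The other term in (3.4) is

\[ \begin{aligned}
E\left[(f^{(1)}/f)^2\right]
  &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s} \sum_{k=0}^{\infty} \frac{[f^{(1)}(s;k,\alpha,\beta)]^2}{f(s;k,\alpha,\beta)} \\
  &\qquad \{k = 0 \Rightarrow f^{(1)} = 0, \text{ hence start from } k = 1; \text{ use Property 2}\} \\
  &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s} \sum_{k=1}^{\infty} \frac{[f(s;k-1,\alpha,\beta)]^2}{f(s;k,\alpha,\beta)} \\
  &\qquad \{\text{now set } n = k - 1\} \\
  &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s} \sum_{n=0}^{\infty} \frac{[f(s;n,\alpha,\beta)]^2}{f(s;n+1,\alpha,\beta)}
\end{aligned} \]

which is easy to implement in a numerical routine, because one only needs a single evaluation of
$f(s;k,\alpha,\beta)$ for each term in the sum (appendix A), in a loop which terminates when the current
term gives a negligible contribution (for example, when the ratio between the term and
the partial sum becomes less than $10^{-6}$, as was done in the examples considered in the next
section).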
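A minimal sketch of such a routine is given below. It accumulates the terms $[f(s;n,\alpha,\beta)]^2 / f(s;n+1,\alpha,\beta)$, multiplies the sum by the normalization factor of the marginal model and subtracts the unit contribution of $E[f^{(2)}/f]$. The function name, the `max_terms` cap and the extra guard `n > s` (which delays the stopping test until the terms have passed their maximum) are assumptions of this sketch, not part of the paper:

```python
import math


def fisher_information(s, alpha, beta, tol=1e-6, max_terms=2000):
    """Fisher's information I(s) of the marginal model, by truncated summation."""
    norm = (beta / (1.0 + beta)) ** alpha * math.exp(-s)
    c = [1.0]        # c[n] = binom(alpha+n-1, n) / (1+beta)**n
    t = [1.0]        # t[m] = s**m / m!
    f_prev = 1.0     # f(s; 0)
    total = 0.0
    for n in range(max_terms):
        # extend both tables to index n+1, then evaluate f(s; n+1)
        c.append(c[-1] * (alpha + n) / ((n + 1) * (1.0 + beta)))
        t.append(t[-1] * s / (n + 1))
        f_next = sum(c[j] * t[n + 1 - j] for j in range(n + 2))
        term = f_prev * f_prev / f_next
        total += term
        # stop once the current term is negligible with respect to the partial sum
        if n > s and term < tol * total:
            break
        f_prev = f_next
    return norm * total - 1.0


# alpha = 1, beta = 0.5 corresponds to E[b] = 2 with 100% relative uncertainty
print(fisher_information(0.0, 1.0, 0.5))
# Nearly known background b0 = 2: the result should be close to 1/(s + b0)
print(fisher_information(1.0, 4000.0, 2000.0), 1.0 / 3.0)
```

Each pass of the loop costs one evaluation of the polynomial, in line with the remark above. Because the result is a difference of two numbers of order one, the tolerance should be tightened when $I(s)$ itself is small (large $s$).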

Finally, the Fisher's information is

\[ I(s) = \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s} \sum_{n=0}^{\infty} \frac{[f(s;n,\alpha,\beta)]^2}{f(s;n+1,\alpha,\beta)} - 1 \tag{3.5} \]

the same result being obtained when computing the expectation of the square of the first derivative
(3.2) of the marginal log-likelihood, because $E[f^{(1)}/f] = 1$ by virtue of Property 3. Appendix D
shows that $I(s)$ gives the correct reference prior (Jeffreys' prior) when there is certain prior
knowledge about the background yield.
4. The reference prior

The reference prior for $s$ is $\pi(s) \propto |I(s)|^{1/2}$ [12]. From equation (3.5) one obtains

\[ |I(s)|^{1/2} = \left| \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s} \sum_{n=0}^{\infty} \frac{[f(s;n,\alpha,\beta)]^2}{f(s;n+1,\alpha,\beta)} - 1 \right|^{1/2} \tag{4.1} \]

which only requires one evaluation of the function $f(s;n,\alpha,\beta)$ (see appendix A) per
iteration of the sum.

When $s \to \infty$ the function (4.1) does not decrease fast enough to be integrable over the
positive real axis. In the asymptotic regime, the leading term in the polynomial $f(s;k,\alpha,\beta)$ is
proportional to $s^k$, which means that each term in the sum diverges as a polynomial in $s$,
and the exponential $e^{-s}$ is not sufficient to ensure that the sum goes to zero for $s \to \infty$ (easier
to check in log scale: $\log(e^{-s} s^n) = -s + n \log s \to \infty$ when both $n$ and $s$ go to $\infty$). This means that
$\pi(s)$ is not a proper prior: it cannot represent somebody's degree of belief. In the framework of
Bayesian reference analysis, the use of improper reference priors is admitted, provided that the
reference posterior is a proper probability density (which is the case here).

For $s \to 0$ the polynomial $f(s;n,\alpha,\beta)$ reduces to a constant

\[ f(0;n,\alpha,\beta) = \binom{\alpha+n-1}{n} (1+\beta)^{-n} = \frac{\alpha+n-1}{n\,(1+\beta)}\, f(0;n-1,\alpha,\beta) \tag{4.2} \]

which goes to zero for $n \to \infty$ fast enough to make the series in equation (4.1) converge. The
function $|I(s)|^{1/2}$ has its single maximum at zero, hence a possible definition of the reference prior
for the signal is

\[ \pi(s) = \frac{|I(s)|^{1/2}}{|I(0)|^{1/2}} \tag{4.3} \]

which is a monotonically decreasing function attaining its maximum value of one at zero. The
reference prior $\pi(s)$ is shown in figure 2 for different choices of background parameters. The
maximum at $s = 0$ becomes more pronounced for small values of $\alpha$ and large values of $\beta$, whereas
large values of $\alpha$ and small values of $\beta$ make the prior flat. As expected, combinations which
have similar background expectation and variance give almost the same curve. In figure 2 such
examples are: the $(\alpha,\beta)$ pairs (0.1, 3) and (1, 30), both with $E[b] = 0.033$, which have $V[b] = 0.011$
and $V[b] = 0.0011$; the pairs (0.1, 10) and (1, 100), both with $E[b] = 0.01$, which have $V[b] = 0.001$
and $V[b] = 0.0001$; the pairs (1, 10) and (10, 100), both with $E[b] = 0.1$, which have $V[b] = 0.01$
and $V[b] = 0.001$.
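The normalized prior can be evaluated pointwise once Fisher's information is available. The sketch below recomputes $I(s)$ by truncated summation and returns $|I(s)|^{1/2}/|I(0)|^{1/2}$; the function names, the tolerance and the stopping guard `n > s` are illustrative choices of this sketch rather than the paper's implementation:

```python
import math


def fisher_information(s, alpha, beta, tol=1e-6, max_terms=2000):
    """I(s) from the series of squared polynomials, truncated at relative size tol."""
    norm = (beta / (1.0 + beta)) ** alpha * math.exp(-s)
    c, t = [1.0], [1.0]     # binomial-type coefficients and s**m / m!
    f_prev, total = 1.0, 0.0
    for n in range(max_terms):
        c.append(c[-1] * (alpha + n) / ((n + 1) * (1.0 + beta)))
        t.append(t[-1] * s / (n + 1))
        f_next = sum(c[j] * t[n + 1 - j] for j in range(n + 2))
        term = f_prev * f_prev / f_next
        total += term
        if n > s and term < tol * total:
            break
        f_prev = f_next
    return norm * total - 1.0


def reference_prior(s, alpha, beta):
    """pi(s), normalized to one at s = 0."""
    return math.sqrt(abs(fisher_information(s, alpha, beta))
                     / abs(fisher_information(0.0, alpha, beta)))


# Two background priors with the same expectation E[b] = 0.1
for alpha, beta in [(1.0, 10.0), (10.0, 100.0)]:
    print(alpha, beta, [round(reference_prior(s, alpha, beta), 3) for s in (0.0, 0.5, 1.0, 2.0)])
```

Printing the two rows side by side gives a quick way to compare curves for parameter pairs such as those listed above.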



5. The marginal reference posterior 



The joint prior for our model is the improper function $\pi(s,b) = \pi(s)\, p(b)$, where $\pi(s)$ is defined
by equation (4.3) (or any other function which is proportional to $|I(s)|^{1/2}$) and $p(b)$ is the Gamma
density (1.2) which encodes the experimenter's prior degree of belief about the background.
[Figure 2: three panels titled "Reference prior π(s)", each showing π(s) as a function of s for a fixed value of α (0.1, 1 or 10) and β = 0.1, 0.3, 1, 3, 10, 30, 100.]

Figure 2. Reference prior for the signal parameter $s$ obtained with background parameters $\alpha = 0.1$
(top-left), $\alpha = 1$ (bottom-left), $\alpha = 10$ (top-right) and
$\beta = 0.1, 0.3, 1, 3, 10, 30, 100$. The top-down sequence
of the functions is the same as the list in the legend of
each plot.



The joint reference posterior is $p(s,b|k) \propto \mathrm{Poi}(k|s+b)\, p(b)\, \pi(s)$ and the
marginal posterior for $s$ is obtained after integration over $b$. Finally, because $\pi(s)$ does not explicitly
depend on $b$, the marginal posterior is proportional to the product of the reference prior (4.3) and
the marginal likelihood (2.5):

\[ p(s|k) \propto \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s}\, f(s;k,\alpha,\beta)\, \pi(s) . \tag{5.1} \]
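To make this recipe concrete, the sketch below tabulates the unnormalized marginal posterior on a grid and extracts an upper limit by trapezoidal integration. The grid range and spacing, the function names and the truncated-sum evaluation of $I(s)$ are all assumptions of this sketch; constant factors (including the normalization of the prior) are dropped because they cancel when the posterior is normalized:

```python
import math


def f_poly(s, k, alpha, beta):
    """Polynomial f(s;k,alpha,beta) of the marginal model, by direct summation."""
    total, coeff = 0.0, 1.0
    for n in range(k + 1):
        power = 1.0
        for m in range(1, k - n + 1):
            power *= s / m
        total += coeff * power
        coeff *= (alpha + n) / ((n + 1) * (1.0 + beta))
    return total


def fisher_information(s, alpha, beta, tol=1e-6, max_terms=2000):
    """I(s) by truncated summation (stopping rule is an assumption of this sketch)."""
    norm = (beta / (1.0 + beta)) ** alpha * math.exp(-s)
    c, t, f_prev, total = [1.0], [1.0], 1.0, 0.0
    for n in range(max_terms):
        c.append(c[-1] * (alpha + n) / ((n + 1) * (1.0 + beta)))
        t.append(t[-1] * s / (n + 1))
        f_next = sum(c[j] * t[n + 1 - j] for j in range(n + 2))
        term = f_prev * f_prev / f_next
        total += term
        if n > s and term < tol * total:
            break
        f_prev = f_next
    return norm * total - 1.0


def posterior_upper_limit(k, alpha, beta, cl=0.95, s_max=40.0, steps=400):
    """Upper limit on s at credibility cl from the gridded marginal posterior."""
    h = s_max / steps
    grid = [i * h for i in range(steps + 1)]
    dens = [math.exp(-s) * f_poly(s, k, alpha, beta)
            * math.sqrt(max(fisher_information(s, alpha, beta), 0.0)) for s in grid]
    cum = [0.0]                                   # cumulative trapezoidal integral
    for i in range(1, len(grid)):
        cum.append(cum[-1] + 0.5 * h * (dens[i - 1] + dens[i]))
    target = cl * cum[-1]
    for i in range(1, len(grid)):
        if cum[i] >= target:                      # linear interpolation inside the bin
            frac = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
            return grid[i - 1] + frac * h
    return s_max


# Zero observed counts, E[b] = 2 with 100% relative uncertainty (alpha = 1, beta = 0.5)
print(posterior_upper_limit(0, 1.0, 0.5))
```

A grid is the simplest choice for a one-dimensional posterior; central intervals can be read off the same cumulative array by locating two quantiles instead of one.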



In order to illustrate the properties of the marginal reference posterior (5.1), we choose
examples in which the prior expectation is small ($E[b] = 2$) and the number of observed events is small
too ($k = 0, 1, \ldots, 15$ counts), and find the 68.3%, 90%, and 95% posterior credible intervals for the
signal, choosing central intervals if they contain the posterior mode or upper limits otherwise (both
are invariant under reparametrization). We consider uncertainties on the prior expectation of 10%,
20%, 50%, and 100%; the corresponding shape and rate parameters of the background prior are listed
in table 3 and the prior densities are shown in figure 4. The marginal model, for
different choices of the background prior and sample size, is shown in figure 5. The corresponding
marginal posteriors are shown in figure 6 and their numerical summaries are provided by the tables
in the appendix, in the form of left and right bounds of the 68.3%, 90%, 95% credible
intervals, mean, median, mode, variance, skewness and excess kurtosis.

One aspect which deserves some comment is the case of zero observed counts. Although the
marginal posterior (5.1) does not explicitly depend on the expected background, it still contains
it indirectly, due to the integration over the background prior. This implies that the upper limit
in the case of no counted events (as for any number of counts) does depend on the background prior. This has
been verified with a scan of the prior parameters, choosing a background expectation of $E[b] =
0.5, 1, 2, 4, 8$ counts and a relative uncertainty of 10%, 20%, 50%, 100%, and 150%. The results
are shown in table 2, which reports the posterior upper limit at 95% credibility level for zero counts
as a function of the prior background expectation and relative uncertainty. The upper limit moves
to the right with increasing background expectation, because the posterior becomes broader. On the other
hand, it increases slightly (negligibly if $E[b]$ is very small) with increasing relative uncertainty
up to 50%, and then tends to decrease for larger uncertainties, with a more pronounced trend
for high background expectations.

[Plot titled "Reference prior for the signal, with bkg = 2"]

  E[b]    rel. unc.    α      β
  2       0.1          100    50
  2       0.2          25     12.5
  2       0.5          4      2
  2       1.0          1      0.5

Figure 3. Parameters for the background priors.

Figure 4. The background prior distributions.

[Figure 5: one of the panels is titled "Marginal model with bkg = 2 ± 0.2"]

Figure 5. Marginal likelihood for the true signal, for different background priors ($E[b] = 2$ with relative
uncertainty 10%, 20%, 50%, and 100% from top-left to bottom-right) and sample sizes ($k = 0, 1, \ldots, 15$).

Figure 6. Marginal posteriors for the signal, for different background priors ($E[b] = 2$ with relative
uncertainty 10%, 20%, 50%, and 100% from top-left to bottom-right) and sample sizes ($k = 0, 1, \ldots, 15$).

Quoting upper limits which depend on the prior background expectation in the case of zero
observed counts produces a result which is similar to the classical results (summarized in [ffj])
but may appear suboptimal from the Bayesian point of view. The reason is that one important
piece of information is ignored: in this case we know with certainty that there is no contribution
from the background process, hence one can interpret the result using the simpler model in which
the signal alone is described by a Poisson process. Using the reference prior (which is Jeffreys'
prior) in this case gives the reference posterior $p(s|0) = \mathrm{Ga}(s; \tfrac{1}{2}, 1)$ and the upper limit, found as
the 0.95-quantile of the Gamma density, is 1.92 counts, which is lower than all values reported in
table 2. One would obtain the same result with the model presented here, in the limit of a
delta-function background prior which gives unit mass to the single point $b = 0$, because in
this case one obtains the 1-dimensional Poisson model (by virtue of the result of appendix D). This
upper limit does not depend on the prior knowledge about the background and looks a bit
"aggressive", compared to the well-known upper limit of 3 counts which is obtained in the 1-dimensional
case both using a flat prior (which is falsely considered non-informative and leads to solutions
which are not invariant under reparametrization) and the classical approach [15]. On the other
hand, it is the result of including all available information about the problem under consideration.
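The two one-dimensional upper limits quoted here for zero observed counts are easy to reproduce. With Jeffreys' prior the posterior is a Gamma density with shape 1/2 and unit rate, whose cumulative distribution is $\mathrm{erf}(\sqrt{s})$; with a flat prior the posterior is $e^{-s}$. The sketch below (helper names and the bisection bracket are illustrative assumptions) solves for the 0.95-quantile in both cases:

```python
import math


def jeffreys_posterior_cdf_zero_counts(s):
    """CDF of the Gamma density with shape 1/2 and rate 1, equal to erf(sqrt(s))."""
    return math.erf(math.sqrt(s))


def upper_limit(cdf, cl=0.95, lo=0.0, hi=50.0, iterations=100):
    """Solve cdf(s) = cl by bisection on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(upper_limit(jeffreys_posterior_cdf_zero_counts), 2))   # 1.92
print(round(upper_limit(lambda s: 1.0 - math.exp(-s)), 2))         # 3.0
```

The flat-prior value is $-\log 0.05 \approx 2.996$, which is the "3 counts" mentioned in the text.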



  E[b]     Relative uncertainty on E[b]:
           10%     20%     50%     100%    150%

  0.5      2.55    2.55    2.56    2.57    2.55
  1.0      2.65    2.66    2.68    2.66    2.60
  2.0      2.77    2.78    2.80    2.74    2.62
  4.0      2.88    2.89    2.91    2.78    2.62
  8.0      2.96    2.98    2.99    2.81    2.62

Table 2. Upper limits at 95% credibility level for zero observed counts.
Given the observed counts and the prior beliefs on the background values, the method
described in this paper provides an objective Bayesian solution for the statistical inference about the
true signal: the marginal reference posterior (5.1). In the context of a specific analysis, it is
important to study the sampling properties of the solution. In the large sample size limit, Bayesian
$q$-credible regions are always approximate confidence regions with coverage $q$ and in some cases
they even have exact coverage, as happens for location-scale models [16]. However, the
actual measurement may be far from the asymptotic regime. In addition our model is discrete, such
that the coverage is not exact (of course this is also true for frequentist solutions), even though the
fluctuations around the coverage $q$ become smaller and smaller with increasing sample sizes.

A frequentist study of the coverage for different solutions of the Poisson problem, in which
the reference posterior (which corresponds to the use of Jeffreys' prior) is also included, is
presented in Ref. We adopted a similar treatment to study a few examples in which the small
sample size is clearly far from the asymptotic limit, such that the coverage properties are not
expected to be ideal. Different possible observations ($k = 0, 1, \ldots, 15$ counts) have been considered.
The coverage properties are a function of five parameters: the true signal and background yields,
the shape and rate parameters of the background prior, and the observed count. The first step is to
average over the possible observations with the corresponding Poisson weights, which leaves four
degrees of freedom. We report the results obtained with the four background priors which
correspond to an expectation of $E[b] = 2$ with relative uncertainty of 10%, 20%, 50%, and 100% (the
actual study considered background expectations of 0.5, 1, 2, 4, and 8 counts and also included the
cases of 5% and 150% relative uncertainties, but the qualitative features are the same as in the
examples presented here). For each prior, we still have two degrees of freedom left, the true signal and
background yields. Hence the coverage can be shown in the form of 2-dimensional histograms.
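The averaging step can be sketched as follows. To keep the example short and self-contained, the interval rule used here is a stand-in and not the marginal reference posterior of this paper: it is the 95% upper limit obtained from a flat prior in the background-free Poisson model, whose posterior is a Gamma density with shape $k+1$ and unit rate. All function names and the truncation `k_max` are assumptions of the sketch; any other rule mapping an observed count to an upper limit can be passed in its place:

```python
import math


def poisson_pmf(k, mu):
    """Poi(k | mu) for mu > 0."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def poisson_cdf(k, mu):
    """P(K <= k) for K ~ Poi(mu)."""
    term = total = math.exp(-mu)
    for j in range(1, k + 1):
        term *= mu / j
        total += term
    return total


def flat_prior_upper_limit(k, cl=0.95):
    """Stand-in rule: upper limit from the flat-prior posterior (shape k+1, rate 1)."""
    lo, hi = 0.0, k + 10.0 * math.sqrt(k + 1.0) + 20.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 1.0 - poisson_cdf(k, mid) < cl:   # posterior CDF evaluated at mid
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coverage(s_true, b_true, upper_limit, k_max=100):
    """Poisson-weighted fraction of observations whose upper limit covers s_true."""
    mu = s_true + b_true
    return sum(poisson_pmf(k, mu) for k in range(k_max + 1)
               if s_true <= upper_limit(k))


# True signal of 3.5 counts and no background: only k = 0 fails to cover
print(coverage(3.5, 0.0, flat_prior_upper_limit), 1.0 - math.exp(-3.5))
```

Scanning `s_true` and `b_true` over a grid and storing the returned values yields a two-dimensional coverage map of the kind described above.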

Figures 7, 8, 9 and 10 in appendix E show the coverage of 68.3%, 90% and 95% posterior
credible intervals as a function of the true signal and background values. The diagonal structure is
clearly due to the fact that we observe a single quantity, the number of counted events, which is the
best estimator of the sum of true signal and background. This pattern is illustrated by the diagonal
dotted lines, which mark the loci with constant (and integer) sum of signal and background
but different signal fraction. These figures also show the coverage as a function of the true signal
alone, under the assumption that the true background coincides with the prior expectation (the 1D plot
is the slice of the 2D plot at its left along the horizontal dashed line). As expected, a wider prior
uncertainty gives more conservative results, in the sense that the overall tendency is to overcover,
apart from very small values of the true signal, for which there is undercoverage. A narrow
prior leads instead to a more symmetrical distribution of the actual coverage about the nominal one (the
horizontal line in the right plots).

It is interesting to note what happens if the true background is quite different from the prior
expectation. In figure 11 (appendix E) the coverage of the solution which corresponds to a prior
expectation of E[b] = 2 with 100% uncertainty is shown when the true background is half of it
(left plots) or 50% bigger (right plots). The tendency to overcover is more pronounced when the
background is smaller than expected (which comes as no surprise). When the true background is
half a standard deviation higher than expected, the pattern is the same as for the case in which it
matches the expectation (right plots of figure 10) but the fluctuations about the nominal coverage
are less pronounced.
Other figures of merit for the solution are the false exclusion rate in the case of 95% upper limits
and the bias of the signal estimators. In appendix E the cases of prior background expectation
E[b] = 2 counts with relative uncertainties of 10%, 50% and 100% are shown (figures 12, 13, 14,
15, 16). The diagonal pattern is still visible in the plots of the false exclusion rate as a function of
the true signal and background, but not in the bias plots. The false exclusion rate may be larger than
5% when the prior uncertainty is small, but this never happens when it is 100% or larger and it is
very rare with 50% uncertainty. The bias of the different estimators is, in all cases, a monotonically
decreasing function of the true signal. The mode tends to underestimate the true signal when the
prior uncertainty is small, whereas bigger prior uncertainties tend to produce a more symmetric
behaviour. The difference with respect to the true signal is usually less than one unit. On the other
hand, both the mean and the median tend to overestimate the true signal: the median appears
a bit more balanced than the mean, but both show a larger difference with respect to the true signal
than the mode. In conclusion, the intuitive choice of the posterior peak as the best estimator for
the true signal turns out to be the least biased estimator for small sample sizes.

6. Summary and conclusion 

We considered the signal+background model in which both sources are independent Poisson processes,
and prior information about the background intensity b is available, encoded in a
Gamma density of the form (1.2). Following the prescription by [O], the reference prior for the
signal parameter s is computed from the conditional model (2.5). The reference prior is proportional
to the square root of Fisher's information (p.5|), and is not normalizable. This is not a
problem, because the reference posterior is a proper density, and the same also happens when using
a $\delta$-function as the prior for the background. In this case the Jeffreys' prior $\pi(s) \propto (s + b_0)^{-1/2}$
is an improper density too, which cannot represent anybody's degree of belief. The justification
for the use of an improper prior is provided in the framework of Bayesian reference analysis,
in which the reference prior is defined as the mathematical device which maximizes the amount of
missing information and does not represent an actual degree of belief [pp.

Because the prior is improper, one can always scale it by a constant: the normalization of the
posterior density can be computed once the latter is known. For practical applications, we propose
to adopt the improper prior (4.3) obtained by dividing the square root of Fisher's information
by the value which it assumes for s = 0. The resulting function assumes its maximum value (equal
to one) at the origin and decreases monotonically for increasing values of s. Of course, one is also free
to use equation ( pOl ) directly or to scale it by any other constant value.

The joint prior for our model is the improper function $\pi(s,b) = \pi(s)\,p(b)$, where $\pi(s)$ is defined
by equation (4.3) (or any other function which is proportional to $|I(s)|^{1/2}$) and $p(b)$ is the Gamma
density (1.2) which encodes the experimenter's prior degree of belief about the background. Finally,
the marginal reference posterior is proportional to the product of the reference prior (4.3) and
the marginal likelihood (2.5):

\[
p(s|k) \;\propto\; \pi(s)\, e^{-s} f(s;k,\alpha,\beta) .
\]

A few examples of the application of this solution have been shown in section 5, together with their
coverage, false exclusion rate, and the bias of different estimators of the true signal.
A. Computing the conditional model

All results shown in this paper can be easily implemented once the function

\[
f(s;k,\alpha,\beta) = \sum_{n=0}^{k} \binom{\alpha+n-1}{n} \frac{s^{k-n}}{(k-n)!\,(1+\beta)^{n}}
\]

is available. Here we give some suggestions which lead to a fast evaluation of $f(s;k,\alpha,\beta)$.

Let us focus on each addend first. It is better to work with logarithms, because this
avoids rounding problems related to expressions featuring very big and very small values:

\[
\binom{\alpha+n-1}{n} \frac{s^{k-n}}{(k-n)!\,(1+\beta)^{n}}
= \exp\left[ \log\binom{\alpha+n-1}{n} - n\log(1+\beta) - \log(k-n)! + (k-n)\log s \right]
\]

for $s > 0$. For $s = 0$ one computes directly the constant term which corresponds to $n = k$ in the sum
from equation (4.2).

Plotting or integrating $f(s;k,\alpha,\beta)$ as a function of $s$, possibly for different values of $k$, implies
that $\alpha, \beta$ are held constant during the computation. We now consider all terms from left to right.

The binomial coefficient is a function of $n$ at fixed $\alpha$ which grows with $n$ when $\alpha > 1$, and
should be tabulated during the initialization phase. The following recursive relation holds:

\[
\binom{\alpha+n-1}{n} =
\begin{cases}
1 & \text{if } n = 0 \\
\alpha & \text{if } n = 1 \\
\dfrac{\alpha+n-1}{n} \dbinom{\alpha+n-2}{n-1} & \text{if } n \ge 2
\end{cases}
\]

from which it follows that $a(n;\alpha) = \log\binom{\alpha+n-1}{n} = \log[1 + (\alpha-1)/n] + a(n-1;\alpha)$. For example,
one can start by saving into a C++ vector the $a(n;\alpha)$ values for $n = 0, \dots, 100$ and add more terms if
needed later on, such that computing it during the loop over $n$ reduces to accessing the $n$-th element
of the vector.

The value of log s should be computed once, outside the sum. 

Because $\beta$ is fixed, $\log(1+\beta)$ is a constant term which should be computed only once, during the
initialization phase, such that $b(n;\beta) = n\log(1+\beta) = n\,b(1;\beta)$ is just a multiplication step. For
intensive computations (like summations over $k = 0, \dots, \infty$ and/or integrations over $s \in [0,\infty[$) it
can be useful to also store these values during the initialization phase.

Finally, the term $c(m) = \log m! = \log m + c(m-1)$, where $m = k - n$, should also be computed
during the initialization phase (say for $m = 0, \dots, 100$) and stored into a vector for direct access
inside the loop.
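The suggestions above can be put together in a minimal Python sketch (the text mentions a C++ vector; plain Python lists play the same role here). The function names `log_binom_table`, `log_factorial_table` and `f` are hypothetical. One detail goes beyond what the text prescribes: the largest log-term is subtracted before exponentiating, a standard safeguard against overflow that is an assumption of this sketch.

```python
import math

def log_binom_table(alpha, n_max):
    """a(n; alpha) = log C(alpha+n-1, n) for n = 0..n_max, via the recursion."""
    a = [0.0]  # the coefficient equals 1 for n = 0
    for n in range(1, n_max + 1):
        a.append(a[-1] + math.log1p((alpha - 1.0) / n))
    return a

def log_factorial_table(m_max):
    """c(m) = log m! for m = 0..m_max, via c(m) = log m + c(m-1)."""
    c = [0.0]
    for m in range(1, m_max + 1):
        c.append(c[-1] + math.log(m))
    return c

def f(s, k, alpha, beta, a=None, c=None):
    """Evaluate f(s; k, alpha, beta) for s >= 0, summing the terms in log space.

    The tables a and c can be precomputed once (with length > k) and reused
    for many values of s; they are built on the fly if omitted.
    """
    if a is None:
        a = log_binom_table(alpha, k)
    if c is None:
        c = log_factorial_table(k)
    log1p_beta = math.log1p(beta)      # constant for fixed beta
    if s == 0.0:
        return math.exp(a[k] - k * log1p_beta)  # only the n = k term survives
    log_s = math.log(s)                # computed once, outside the sum
    logs = [a[n] - n * log1p_beta - c[k - n] + (k - n) * log_s
            for n in range(k + 1)]
    top = max(logs)                    # shift to avoid overflow (assumption)
    return math.exp(top) * sum(math.exp(t - top) for t in logs)
```

As a quick check, for k = 1 the sum has only two terms, so `f(3.0, 1, 2.0, 1.0)` should agree with s + α/(1+β) = 4 up to rounding.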
B. Properties of $f(s;k,\alpha,\beta)$: Proofs



Proof of property 1 

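Property 1 is equivalent to the statement that E[1] = 1, which is explicitly shown here:

\begin{align*}
E[1] &= \sum_{k=0}^{\infty} p(k|s)
      = \left(\frac{\beta}{1+\beta}\right)^{\alpha} \sum_{k=0}^{\infty} e^{-s} f(s;k,\alpha,\beta) \\
     &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} e^{-s} \sum_{k=0}^{\infty} \sum_{n=0}^{k}
        \binom{\alpha+n-1}{n} \frac{s^{k-n}}{(k-n)!\,(1+\beta)^{n}} \\
     &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} \sum_{n=0}^{\infty} \binom{\alpha+n-1}{n} \frac{1}{(1+\beta)^{n}}
        \; e^{-s} \sum_{m=0}^{\infty} \frac{s^{m}}{m!} \qquad \{m = k - n\} \\
     &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} \frac{1}{[1 - 1/(1+\beta)]^{\alpha}} = 1
\end{align*}

where the following theorems have been used:

\[
e^{-s} \sum_{m=0}^{\infty} \frac{s^{m}}{m!} = 1
\qquad \text{and} \qquad
\sum_{n=0}^{\infty} \binom{\alpha+n-1}{n} x^{n} = \frac{1}{(1-x)^{\alpha}} \quad (|x| < 1) .
\]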

Proof of property 2 



We proceed by induction, showing that equation (2.7) holds for $n = 1$ and that if it holds for $n - 1$
then it must also hold for $n$. For $n = 1$, by direct computation one obtains

\[
f^{(1)}(s;k,\alpha,\beta) = \sum_{n=0}^{k-1} \frac{s^{k-n-1}}{(k-n-1)!} \binom{\alpha+n-1}{n} \frac{1}{(1+\beta)^{n}}
= \begin{cases}
0 & \text{if } k = 0 \\
1 & \text{if } k = 1 \\
f(s;k-1,\alpha,\beta) & \text{if } k > 1
\end{cases}
\]

The next step is to show that if the property (2.7) holds for $n - 1$ then it must also hold for $n$.
For $n > k$ this is trivial (because the derivative is zero), while for $n \le k$

\begin{align*}
\frac{d^{n}}{ds^{n}} f(s;k,\alpha,\beta) &= \frac{d}{ds} \frac{d^{n-1}}{ds^{n-1}} f(s;k,\alpha,\beta) \\
 &= \frac{d}{ds} f(s;k-n+1,\alpha,\beta) \\
 &= f(s;k-n,\alpha,\beta)
\end{align*}

where the last step comes from $f^{(1)}(s;m,\alpha,\beta) = f(s;m-1,\alpha,\beta)$, in which $m = k - n + 1$. This
completes the proof.






Proof of property 3 

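Property 3 follows directly from the previous ones:

\begin{align*}
E\left[\frac{f^{(n)}(s;k,\alpha,\beta)}{f(s;k,\alpha,\beta)}\right]
 &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} \sum_{k=0}^{\infty} e^{-s} f^{(n)}(s;k,\alpha,\beta)
    \qquad \{k < n \Rightarrow f^{(n)} = 0\} \\
 &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} \sum_{k=n}^{\infty} e^{-s} f(s;k-n,\alpha,\beta) \tag{B.1} \\
 &= \left(\frac{\beta}{1+\beta}\right)^{\alpha} \sum_{m=0}^{\infty} e^{-s} f(s;m,\alpha,\beta)
    \qquad \{m = k - n\} \\
 &= E[1] = 1
\end{align*}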
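The three properties proved in this appendix can also be checked numerically. The Python sketch below is only an illustration: the helper names `f` and `marginal` are hypothetical, the infinite sums over k are truncated at 200 terms (an assumption that is adequate for the parameter values chosen here), and the derivative is approximated by a central finite difference.

```python
import math

def f(s, k, alpha, beta):
    """f(s; k, alpha, beta) for s > 0, summed in log space; zero for k < 0."""
    if k < 0:
        return 0.0
    logs = [math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
            - n * math.log1p(beta)
            + (k - n) * math.log(s) - math.lgamma(k - n + 1)
            for n in range(k + 1)]
    top = max(logs)
    return math.exp(top) * sum(math.exp(t - top) for t in logs)

def marginal(k, s, alpha, beta):
    """Marginal model p(k|s) = (beta/(1+beta))^alpha * exp(-s) * f(s;k,alpha,beta)."""
    return (beta / (1.0 + beta)) ** alpha * math.exp(-s) * f(s, k, alpha, beta)

s, alpha, beta = 3.0, 4.0, 2.0   # arbitrary illustrative values
K_MAX = 200                      # truncation of the sums over k

# Normalization: the marginal model sums to one over k.
total = sum(marginal(k, s, alpha, beta) for k in range(K_MAX))

# Derivative rule: d f(s;k)/ds = f(s;k-1), here for k = 5.
h = 1e-5
slope = (f(s + h, 5, alpha, beta) - f(s - h, 5, alpha, beta)) / (2.0 * h)

# Expectation of f^(n)/f with n = 2, using f^(n)(s;k) = f(s;k-n).
ratio_mean = sum(marginal(k, s, alpha, beta)
                 * f(s, k - 2, alpha, beta) / f(s, k, alpha, beta)
                 for k in range(K_MAX))
```

With these values `total` and `ratio_mean` should both be very close to 1, and `slope` should agree with `f(s, 4, alpha, beta)` within the accuracy of the finite difference.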

C. Comparison with the DJP and PPSS papers 

The notation used in the DJP and PPSS papers is different from the one adopted here: 



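[Table of notation correspondences, with columns DJP, PPSS and here. Entries that are legible in the
source and consistent with the formulas below: $\sigma$ (DJP) corresponds to $s$ (PPSS and here);
$b$ (DJP and PPSS) corresponds to $\beta$ here; $y$ (DJP and PPSS) corresponds to $\alpha - 1/2$ here.]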

The DJP model has one more parameter, which is taken to be equal to 1 both here and in the
PPSS paper. DJP wrote the Gamma density which represents this parameter in the form

\[
\pi(\epsilon) = \frac{a\,(a\epsilon)^{x-1/2}\, e^{-a\epsilon}}{\Gamma(x+1/2)} .
\]

By formally setting $a = x$ and taking the limit $x \to \infty$ this distribution degenerates into the delta-function
$\delta(\epsilon - 1)$, which is what is considered here and by PPSS.
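The marginal model of DJP is

\[
p(n|\sigma) = \left(\frac{a}{a+\sigma}\right)^{x+1/2} \left(\frac{b}{b+1}\right)^{y+1/2} S_{n}^{0}(\sigma)
\]

where one polynomial appears of the following family

\[
S_{n}^{m}(\sigma) = \sum_{k=0}^{n} k^{m} \binom{k+x-1/2}{k} \binom{n-k+y-1/2}{n-k}
\frac{[\sigma/(a+\sigma)]^{k}}{(b+1)^{n-k}} .
\]

In the limit $a = x \to \infty$ the polynomials above become

\[
S_{n}^{m}(\sigma) \to \sum_{k=0}^{n} k^{m} \binom{n-k+y-1/2}{n-k} \frac{\sigma^{k}}{k!\,(b+1)^{n-k}}
\tag{C.1}
\]

which in the notation of the PPSS paper coincides with $e^{s}\, T_{n}^{m}(s)$.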
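In the same limit, the DJP marginal model becomes

\[
p(n|\sigma) = e^{-\sigma} \left(\frac{b}{b+1}\right)^{y+1/2}
\sum_{k=0}^{n} \binom{n-k+y-1/2}{n-k} \frac{\sigma^{k}}{k!\,(b+1)^{n-k}} .
\]

In the notation of this paper, the expression above becomes

\begin{align*}
 & e^{-s} \left(\frac{\beta}{\beta+1}\right)^{\alpha} \sum_{k=0}^{n} \binom{n-k+\alpha-1}{n-k}
   \frac{s^{k}}{k!\,(\beta+1)^{n-k}} \qquad \{\text{set } m = n - k \text{ in the sum}\} \\
 &= e^{-s} \left(\frac{\beta}{\beta+1}\right)^{\alpha} \sum_{m=0}^{n} \binom{m+\alpha-1}{m}
   \frac{s^{n-m}}{(n-m)!\,(\beta+1)^{m}} \\
 &= e^{-s} \left(\frac{\beta}{\beta+1}\right)^{\alpha} f(s;n,\alpha,\beta)
   \qquad \leftarrow \text{eq. (2.5) of section 2}
\end{align*}

In a similar way, one can write the marginal model in the PPSS paper in the form of our
equation (2.5), because the function $e^{-s} f(s;n,\alpha,\beta)$ is the same thing as the PPSS function $T_{n}^{0}(s)$.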



D. Model with a degenerate background prior 

If the background yield is known with certainty, the prior for $b$ becomes a delta-function
$p(b) = \delta(b - b_0)$, which is the degenerate form of a Gamma density with parameters
$\beta = \alpha/b_0$, $\alpha \to \infty$. In this case, the marginal model is

\[
p(k|s) = \int_{0}^{\infty} \mathrm{Poi}(k|s+b)\, \delta(b - b_0)\, db = \mathrm{Poi}(k|s+b_0)
\]

whose logarithm is

\[
\log p(k|s) = k \log(s+b_0) - s - b_0 - \log k! \, .
\]

The second derivative of the logarithm of the marginal model is $-k/(s+b_0)^{2}$, such that Fisher's
information is

\[
I(s) = E\left[-\frac{\partial^{2}}{\partial s^{2}} \log p(k|s)\right]
     = e^{-s-b_0} \sum_{k=0}^{\infty} \frac{k}{(s+b_0)^{2}} \frac{(s+b_0)^{k}}{k!}
\]

whose 0-th element is null, hence we can write a sum starting from $k = 1$ and then introduce a new
index $n = k - 1$:

\[
I(s) = e^{-s-b_0} \sum_{k=1}^{\infty} \frac{(s+b_0)^{k-2}}{(k-1)!}
     = \frac{e^{-s-b_0}}{s+b_0} \sum_{n=0}^{\infty} \frac{(s+b_0)^{n}}{n!}
     = \frac{1}{s+b_0}
\]

from which one finds the reference prior as $\pi(s) \propto |I(s)|^{1/2} = (s+b_0)^{-1/2}$.
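We now verify that one gets the same solution when defining $\beta = \alpha/b_0$ and considering the
limit $\alpha \to \infty$ in our model. First, let's consider the function $f(s;k,\alpha,\beta)$:

\[
f(s;k,\alpha,\alpha/b_0) = \sum_{n=0}^{k} \frac{\Gamma(\alpha+n)}{\Gamma(\alpha)}
\frac{b_0^{n}\, s^{k-n}}{(k-n)!\,n!\,(b_0+\alpha)^{n}}
\;\longrightarrow\; \sum_{n=0}^{k} \frac{b_0^{n}\, s^{k-n}}{(k-n)!\,n!} = \frac{(s+b_0)^{k}}{k!}
\]

because $\Gamma(\alpha+n)/[\Gamma(\alpha)\,(b_0+\alpha)^{n}] \to 1$ for $\alpha \to \infty$.
Let's call this limit $f(s;k) = (s+b_0)^{k}/k!$. Next, consider

\[
\left(\frac{\beta}{1+\beta}\right)^{\alpha} = \left(1 + \frac{1}{\beta}\right)^{-\alpha}
= \left(1 + \frac{b_0}{\alpha}\right)^{-\alpha} \;\longrightarrow\; e^{-b_0} .
\]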

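Finally, in this limit the Fisher's information becomes

\[
I(s) = e^{-s-b_0} \sum_{n=0}^{\infty} \frac{f^{2}(s;n)}{f(s;n+1)} - 1
     = e^{-s-b_0} \sum_{n=0}^{\infty} \frac{(s+b_0)^{2n}}{(n!)^{2}} \frac{(n+1)!}{(s+b_0)^{n+1}} - 1 .
\]

The 0-th term in the sum is $(s+b_0)^{-1}$ and, from the 1-st term on, the factorial $(n-1)!$ is well
defined, such that we can write

\[
I(s) = \frac{e^{-s-b_0}}{s+b_0} + e^{-s-b_0} \sum_{n=1}^{\infty} \frac{(s+b_0)^{n-1}}{(n-1)!} \frac{n+1}{n} - 1 .
\]

The sum above can be rewritten

\begin{align*}
\sum_{n=1}^{\infty} \frac{(s+b_0)^{n-1}}{(n-1)!} \frac{n+1}{n}
 &= \sum_{m=0}^{\infty} \frac{(s+b_0)^{m}}{m!} \left(1 + \frac{1}{m+1}\right) \\
 &= e^{s+b_0} + \sum_{m=0}^{\infty} \frac{(s+b_0)^{m}}{m!\,(m+1)}
  = e^{s+b_0} + \frac{1}{s+b_0} \sum_{m=0}^{\infty} \frac{(s+b_0)^{m+1}}{(m+1)!} \\
 &= e^{s+b_0} + \frac{e^{s+b_0} - 1}{s+b_0}
\end{align*}

which inserted in $I(s)$ gives

\[
I(s) = \frac{e^{-s-b_0}}{s+b_0} + e^{-s-b_0} \left[ e^{s+b_0} + \frac{e^{s+b_0} - 1}{s+b_0} \right] - 1
     = \frac{1}{s+b_0}
\]

exactly the same expression which one obtains when starting directly with the delta-function.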
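The limit can also be checked numerically. The Python sketch below assumes that, for a Gamma background prior, the Fisher information has the form $I(s) = (\beta/(1+\beta))^{\alpha} e^{-s} \sum_{n} f^{2}(s;n)/f(s;n+1) - 1$, which follows from $p(k|s) \propto e^{-s} f(s;k,\alpha,\beta)$ together with the properties of appendix B. The names `f` and `fisher_information`, the truncation of the sum at 100 terms and the chosen parameter values are all assumptions made for illustration.

```python
import math

def f(s, k, alpha, beta):
    """f(s; k, alpha, beta) for s > 0, summed in log space."""
    logs = [math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
            - n * math.log1p(beta)
            + (k - n) * math.log(s) - math.lgamma(k - n + 1)
            for n in range(k + 1)]
    top = max(logs)
    return math.exp(top) * sum(math.exp(t - top) for t in logs)

def fisher_information(s, alpha, beta, n_max=100):
    """Truncated evaluation of the assumed expression for I(s)."""
    # (beta/(1+beta))^alpha * exp(-s), written in a numerically stable way
    norm = math.exp(-alpha * math.log1p(1.0 / beta) - s)
    prev = f(s, 0, alpha, beta)
    total = 0.0
    for n in range(n_max):
        nxt = f(s, n + 1, alpha, beta)
        total += prev * (prev / nxt)   # f(s;n)^2 / f(s;n+1)
        prev = nxt
    return norm * total - 1.0

s, b0 = 3.0, 2.0
alpha = 1.0e4                          # large shape parameter: narrow prior
info = fisher_information(s, alpha, alpha / b0)
```

For a large shape parameter such as the one used here, `info` should approach $1/(s+b_0) = 0.2$, in agreement with the delta-function result.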



E. Summaries of the solutions and their properties 

The following tables and plots provide a summary of the marginal posteriors for the signal in the
cases of E[b] = 2 prior background expectation with relative uncertainty of 10%, 20%, 50% and
100%, and of their coverage. In addition, the false exclusion rate and biases of the posterior mode,
mean and median are shown.

Tables 3, 4, 5 and 6 report the left and right bounds of the 68.3%, 90%, 95% credible intervals,
plus mean, median, mode, variance, skewness and excess kurtosis. The coverage of such intervals
has been studied for different observations (k = 0, 1, ..., 15 counts) and the average over k is shown
in figures 7, 8, 9 and 10. Because of the limited range in k considered here, the coverage is to
be regarded as a first approximation when the sum of true signal and background exceeds 10 to 12
counts. In the 2D plots, the diagonal lines have constant sum of true signal and background,
whereas the horizontal line identifies the slice which is shown in the corresponding 1D plot. The
horizontal line in 1D plots shows the nominal coverage.

Figures 12, 13, 14, 15, 16, and 17 show the false exclusion rate (which is the average fraction
of times the true signal is above the 95% upper limit) and the bias of the posterior mode, mean
and median (which is the difference between the estimator and the true signal), as a function of
true signal and background (left plots) and as a function of the true signal alone, for the case in
which the true background coincides with the prior expectation. Only the posteriors obtained with
a prior background expectation E[b] = 2 counts with relative uncertainties of 10%, 50% and 100%
are shown. The empty region in the 2D plots of the false exclusion rate corresponds to identically
zero exclusion rate: in this region the true signal is always smaller than the upper limit.
Table 3. Posterior summary for E[b] = 2 ± 0.2 as a function of the number of observed events.

  k    L95    L90    L68    Mean  Median   Mode    R68    R90    R95    Var.  Skew.  Kurt.
 0-4   (values not legible in the source)
  5   0.33   0.59   1.39    3.65    3.28   2.54   5.88   7.95   9.09    5.28   0.95   1.21
  6   0.64   1.02   2.06    4.56    4.21   3.53   7.04   9.25  10.44    6.47   0.82   0.93
  7   1.12   1.61   2.81    5.52    5.18   4.53   8.21  10.54  11.79    7.59   0.73   0.77
  8   1.71   2.28   3.61    6.50    6.17   5.52   9.39  11.83  13.14    8.65   0.67   0.68
  9   2.37   2.99   4.44    7.50    7.17   6.52  10.55  13.10  14.46    9.67   0.63   0.61
 10   3.06   3.73   5.28    8.50    8.17   7.52  11.71  14.37  15.77   10.68   0.60   0.55
 11   3.77   4.49   6.13    9.50    9.17   8.52  12.86  15.62  17.07   11.68   0.58   0.51
 12   4.49   5.25   6.98   10.50   10.17   9.51  14.01  16.85  18.36   12.68   0.55   0.47
 13   5.22   6.02   7.84   11.50   11.17  10.51  15.15  18.08  19.63   13.68   0.53   0.43
 14   5.96   6.80   8.71   12.50   12.17  11.51  16.28  19.30  20.89   14.67   0.52   0.40
 15   6.71   7.59   9.58   13.50   13.17  12.51  17.41  20.52  22.14   15.67   0.50   0.38


Table 4. Posterior summary for E[b] = 2 ± 0.4 as a function of the number of observed events.
Table 5. Posterior summary for E[b] = 2 ± 1 as a function of the number of observed events.
Table 6. Posterior summary for E[b] = 2 ± 2 as a function of the number of observed events.



[Figure 7 panel titles: "Coverage of the 68.3% interv. for E[b] = 2 ± 10%" and "Coverage of the 68.3% interv. for b = 2 and E[b] = 2 ± 10%"; "Coverage of the 95% interv. for E[b] = 2 ± 10%" and "Coverage of the 95% interv. for b = 2 and E[b] = 2 ± 10%". Horizontal axis: true signal.]

Figure 7. Coverage of the 95% (top), 90% (middle), 68.3% (bottom) posterior credible intervals for E[b] =
2 ± 0.2 as a function of the true signal and background (left panel) and as a function of the true signal alone
in case the true background coincides with the prior expectation (right panel).
Figure 8. Coverage of the 95% (top), 90% (middle), 68.3% (bottom) posterior credible intervals for E[b] =
2 ± 0.4 as a function of the true signal and background (left panel) and as a function of the true signal alone
in case the true background coincides with the prior expectation (right panel).
Figure 9. Coverage of the 95% (top), 90% (middle), 68.3% (bottom) posterior credible intervals for E[b] =
2 ± 1 as a function of the true signal and background (left panel) and as a function of the true signal alone in
case the true background coincides with the prior expectation (right panel).

Figure 10. Coverage of the 95% (top), 90% (middle), 68.3% (bottom) posterior credible intervals for E[b] =
2 ± 2 as a function of the true signal and background (left panel) and as a function of the true signal alone in
case the true background coincides with the prior expectation (right panel).






[Figure 11 panel title: "Coverage of the 68.3% interv. for b = 1 and E[b] = 2 ± 100%".]

Figure 11. Coverage of the 95% (top), 90% (middle), 68.3% (bottom) posterior credible intervals for E[b] =
2 ± 2 as a function of the true signal, when the true background is half of the background expectation (left)
or 50% bigger (right).
Figure 12. Top: false exclusion rate as a function of the true signal and background (top-left) and of
the true signal alone for the case in which the true background coincides with the prior expectation (top- 
right). Bottom: bias of the posterior mode as a function of the true signal and background (bottom-left) 
and of the true signal alone for the case in which the true background coincides with the prior expectation 
(bottom-right). The posterior corresponds to a prior background expectation of 2 counts with 10% relative 
uncertainty. 
Figure 13. Bias of the posterior mean (top) and median (bottom) as a function of the true signal and
background (left plots) and of the true signal alone for the case in which the true background coincides with 
the prior expectation (right plots). The posterior corresponds to a prior background expectation of 2 counts 
with 10% relative uncertainty. 
Figure 14. Top: false exclusion rate as a function of the true signal and background (top-left) and of
the true signal alone for the case in which the true background coincides with the prior expectation (top- 
right). Bottom: bias of the posterior mode as a function of the true signal and background (bottom-left) 
and of the true signal alone for the case in which the true background coincides with the prior expectation 
(bottom-right). The posterior corresponds to a prior background expectation of 2 counts with 50% relative 
uncertainty. 
Figure 15. Bias of the posterior mean (top) and median (bottom) as a function of the true signal and
background (left plots) and of the true signal alone for the case in which the true background coincides with 
the prior expectation (right plots). The posterior corresponds to a prior background expectation of 2 counts 
with 50% relative uncertainty. 
Figure 16. Top: false exclusion rate as a function of the true signal and background (top-left) and of
the true signal alone for the case in which the true background coincides with the prior expectation (top- 
right). Bottom: bias of the posterior mode as a function of the true signal and background (bottom-left) 
and of the true signal alone for the case in which the true background coincides with the prior expectation 
(bottom-right). The posterior corresponds to a prior background expectation of 2 counts with 100% relative 
uncertainty. 
Figure 17. Bias of the posterior mean (top) and median (bottom) as a function of the true signal and
background (left plots) and of the true signal alone for the case in which the true background coincides with 
the prior expectation (right plots). The posterior corresponds to a prior background expectation of 2 counts 
with 100% relative uncertainty. 